I. INTRODUCTION

The procedure usually called "signal processing" may be
factored into two parts:

(a) data interpretation and
(b) decision making.

It  is  the  contention  and  thesis  of  this  paper  that  only  the 
former  is  the  proper  realm  of  the  signal  processor;  that 
decision  making  is  a line  or  command  function  while  signal 
processing  is  an  interpretive  or  staff  function;  and  that 
confusion,  misunderstanding,  and  inefficient  system  design 
result  when  these  two  separable  notions  are  mingled  and 
confounded. 

It  is  not  my  position  that  signal  processors  should  never 
make  decisions,  nor  do  I claim  that  tactical  or  line 
commanders  should  not  engage  in  signal  interpretation. 
However,  I do  insist  that  (these  functions  being  separate) 
those  who  engage  in  both  should  know  at  each  moment  which 
role  they  then  are  playing. 
The deep-seated confusion about this central issue is
reflected  in  the  naive  terminology  which  pervades  the  world 
of  the  signal  processor-user.  For  instance: 

Detection:  At  some  times  this  means  "rectification"; 
at  others  it  means  that  a voltage  has  exceeded  a 
"preset"  (but  probably  unspecified)  threshold;  at 
yet  others  it  means  that  a target  (of  some  probably 
agreed  upon  sort,  probably  unspecified)  has  been 
"observed"  and  "recognized."  A roomful  of  signal 
processors  and  users  will  probably,  in  a 10-minute 
period,  use  the  word  "detection"  in  at  least  two  of 
these  three  senses,  with  at  least  a 50%  chance  of 
misunderstanding. 

False  alarm:  This  means  ordinarily  that  a "detection" 
[in  the  second  sense  above]  has  occurred  when  the 
user  of  the  word  "false  alarm"  wishes  it  hadn't. 

The  tendency  for  users  to  set  the  words  "detection" 
and  "false  alarm"  in  opposition  shows  the  extent 
of  the  terminological  blur.  Actually,  of  course, 
"false  alarms"  are  a subset  of  "detections." 

The fact is that received data--in the form of radio
messages, sonar or radio echoes, or whatever--serve to
revise our concept of the probabilities which describe what
we believe about the world. It is the signal processor's
job to assess how probable it is that we would get the
data--messages, echoes, or whatnot--which we have actually
gotten, under each of the possible mutually exclusive
alternative hypotheses about what may be true.
The next step is to combine these results with a pre-message
probabilistic  description  of  the  situation,  and  so  to 
produce  revised,  updated,  and  improved  descriptions.  These 
revised  descriptions  (in  the  form  of  probabilities)  then 
enable  those  concerned  with  decision  making  to  make  the  best 
decisions  possible  on  the  basis  of  available  information. 

For instance, consider the following idealized and simplified
problem: Suppose there is one target only and that there
are n locations at which this target may be. Suppose that,
after processing all signals available, an ideal processor
concludes that the probability the target is not present at
all is $p_0$, the probability the target is present and is in
location #1 is $p_1$, the probability the target is in location
#2 is $p_2$, etc. The p vector can be used as input to many
tactical problems. Suppose the target is of interest to us
only in that it presents a threat to ourselves. Suppose we
wish to attain, as a minimum, a 98% probability of survival.
Suppose we have devices (probably expensive) with which we
can destroy the target at any given location if we choose
and if it is really there. Then, clearly, if $p_0$ exceeds .98,
we need expend no ammunition under these rules. If $p_0 < .98$,
we note that

$$p_0 = 1 - [p_1 + p_2 + \dots + p_n]$$

and we attack the location with the largest $p_j$. If the
elimination of $p_j$ is sufficient to raise $p_0$ above .98,
attacking one location is sufficient. If not, we attack
the location with the next largest p as well, etc., until the
resultant $p_0$ exceeds .98.
By this procedure we attain a survival probability of at least
98% while minimizing our use of ammunition.
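
The attack rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names, and the sample probabilities are hypothetical, and "attain as a minimum 98%" is read as "stop as soon as the survival probability is at least .98."

```python
def locations_to_attack(p_absent, p_locations, required_survival=0.98):
    """Greedy rule from the text: attack the most probable locations first.

    p_absent    -- derived probability that no target is present (p_0)
    p_locations -- derived probabilities p_1..p_n that the target is at each location
    Returns (list of attacked location indices, resulting survival probability).
    """
    survival = p_absent
    attacked = []
    # Visit locations from the most probable to the least probable.
    order = sorted(range(len(p_locations)),
                   key=lambda j: p_locations[j], reverse=True)
    for j in order:
        if survival >= required_survival:
            break
        # Attacking location j removes the threat carried by p_j.
        attacked.append(j)
        survival += p_locations[j]
    return attacked, survival


# Hypothetical numbers: p_0 = .90 and four candidate locations.
attacked, survival = locations_to_attack(0.90, [0.005, 0.06, 0.03, 0.005])
print(attacked, round(survival, 3))
```

Sorting by descending probability is what keeps the ammunition count minimal: each shot buys the largest remaining increase in survival probability.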

Some  specific  discussion  of  the  signal  processor's  job  is 
given  in  section  II  below,  and  a few  examples  of  the  results 
from  a simple  optimal  signal  interpreter  are  given.  Then, 
in  section  III,  some  questions  are  raised  which  we  feel 
merit  serious  study. 


6500  TRACOR  LANE.  AUSTIN,  TEXAS  78721 


II. PROBABILITY, LIKELIHOOD RATIOS, AND RELATED TOPICS

The  signal  processor's  proper  job  lies  in  the  gap  between 
data  and  decision.  His  function  is  to  distill  from  the 
data  the  best  possible  probabilistic  description  of  the 
current  situation.  In  order  to  do  this,  the  signal 
processor  must  know  how  the  statistics  of  the  received 
data  should  vary  depending  upon  the  situation--for  instance, 
in  a typical  case,  he  must  know  the  statistical  description 
of  noise  on  the  one  hand  and  of  noise  plus  target  on  the 
other. 

Strictly  speaking,  if  the  signal  processor  is  to  provide  a 
probabilistic  interpretation  of  a message,  he  must  have  one 
other  input,  the  a priori  probabilities  which  describe  the 
pre-message  situation. 

Let us consider the simplest sort of example. Suppose we
are dealing with a two-alternative problem. Suppose the
a priori probability of event #1 is $\alpha_1$ and the a priori
probability of event #2 is $\alpha_2 = 1 - \alpha_1$. Suppose we receive
a message X. (X may be a number, a telegram, a matrix, or
anything.) Let $\beta_1$ be the probability that we receive
message X if alternative #1 is true. Let $\beta_2$ be the
probability of receiving message X if alternative #2 is
true. Then, the message X tells us that the probability
that alternative #1 is the case is

$$p_1 = \frac{\alpha_1 \beta_1}{\alpha_1 \beta_1 + \alpha_2 \beta_2}$$

and for alternative #2 the probability is

$$p_2 = \frac{\alpha_2 \beta_2}{\alpha_1 \beta_1 + \alpha_2 \beta_2}$$
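
As a minimal sketch of the two-alternative revision just stated, the following Python function takes the a priori probability of alternative #1 and the two message probabilities and returns the revised pair. The function name and the sample values are hypothetical, chosen only for illustration.

```python
def posterior_two(alpha1, beta1, beta2):
    """Revise a two-alternative a priori probability after one message.

    alpha1 -- a priori probability of alternative #1 (alternative #2 gets 1 - alpha1)
    beta1  -- probability of receiving the message if alternative #1 is true
    beta2  -- probability of receiving the message if alternative #2 is true
    Returns (p1, p2), the revised probabilities.
    """
    alpha2 = 1.0 - alpha1
    evidence = alpha1 * beta1 + alpha2 * beta2
    if evidence == 0.0:
        raise ValueError("the message is impossible under both alternatives")
    p1 = alpha1 * beta1 / evidence
    return p1, 1.0 - p1


# Hypothetical example: total prior uncertainty, message four times
# as probable under alternative #1 as under alternative #2.
p1, p2 = posterior_two(0.5, 0.8, 0.2)
print(p1, p2)
```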


Many  users  of  data  processing  outputs  are  loath  to  state 
a priori  probability  values  for  the  data  processors.  This 
is understandable, but the result is that, without $\alpha$'s, $p$'s
cannot be computed. [Ask a hypothetical question and
you get a hypothetical answer.] Actually, any user of
signal processing must have--if not specific a priori
probabilities  in  mind--at  least  a range  or  zone  of  values 
somewhere  between  0 and  1 in  mind.  Otherwise,  reflection 
shows  he  would  have  no  need  for  signal  processing. 


Now, in the two-alternative case at hand, this problem may
be neatly circumvented. A little algebra shows that

$$\frac{p_1}{1 - p_1} = \frac{\beta_1}{\beta_2} \cdot \frac{\alpha_1}{1 - \alpha_1}$$

so that the effect of the likelihood ratio $\beta_1 / \beta_2$ upon the whole
family of possible $\alpha$ values can be simply graphed, since the
logarithms of the quantities shown are linearly related.
Figure I shows such a graph, and shows how a given likelihood
ratio transforms any $\alpha$ into the resulting $p$. The diagonal
lines are labeled with logs (to the base 10) of the likelihood
ratios. Note that sequential independent messages may
be treated as a single combined message by multiplying
together the likelihood ratios involved. For instance,
suppose that we conduct two independent experiments or
receive two independent messages. Suppose that the likelihood
ratio resulting from the first is 100, and that from the
second is 10. The first experiment converts the a priori
probability of .02 to .67 (see chart). Entering with our
revised opinion of .67 as the new $\alpha$, the second experiment
(which produces the likelihood ratio 10) converts .67 to .95
which (happily) we note--on further scrutiny of the chart--
is the same answer we would have gotten directly by going
from .02 to the line for likelihood ratio equals 1000.
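
The chart lookup in this example can be reproduced numerically. The sketch below works in odds, where a likelihood ratio simply multiplies the prior odds; the function name is hypothetical and the a priori value is assumed to lie strictly between 0 and 1.

```python
def update_with_ratio(alpha, likelihood_ratio):
    """Convert an a priori probability into a revised one via the odds form.

    alpha must lie strictly between 0 and 1.
    """
    prior_odds = alpha / (1.0 - alpha)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Two independent messages applied one after the other...
after_first = update_with_ratio(0.02, 100.0)
after_second = update_with_ratio(after_first, 10.0)
# ...and the same two messages treated as one with ratio 100 * 10.
combined = update_with_ratio(0.02, 1000.0)

print(round(after_first, 2), round(after_second, 2), round(combined, 2))
```

The two routes agree because the odds are multiplied by 100 and then by 10 in the first case and by 1000 in the second.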

As one might expect, things get more complicated when we
set out to consider situations with more than two alternatives.
A certain amount of geometric visualization is of
use here. Note that, in the two-alternative case we have
been considering, the universe consists of the straight
line interval connecting the points (1,0) and (0,1) in the
plane. For the three-alternative problem, the universe is
the triangle-plus-its-interior lying in the first octant
with vertices at (1,0,0), (0,1,0), and (0,0,1). For the
four-alternative case, the universe is the surface-and-interior
of the regular tetrahedron with vertices (1,0,0,0),
(0,1,0,0), (0,0,1,0), and (0,0,0,1). Etc.

Consider the general case, for the n alternative problem.
Once a message is interpreted into a $\beta$ vector--where for
each j, $\beta_j$ is proportional to the probability that we
would have received that message if alternative j were
true--then the revised probabilities ($p_j$) are derived from
the a priori probabilities ($\alpha_j$) by the expressions

$$p_j = \frac{\beta_j \alpha_j}{\beta_1 \alpha_1 + \beta_2 \alpha_2 + \dots + \beta_n \alpha_n}$$

for j = 1, 2, 3, ..., n
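
A minimal Python sketch of this n-alternative revision follows. The function name and sample vectors are hypothetical; the only requirement on the message vector is that each entry be proportional to the probability of the message under the corresponding alternative.

```python
def revise(alpha, beta):
    """Revise a priori probabilities alpha using an interpreted message beta.

    alpha -- a priori probabilities, one per alternative, summing to 1
    beta  -- values proportional to the probability of the received message
             under each alternative (any common scale factor is allowed)
    Returns the list of revised probabilities p_j.
    """
    if len(alpha) != len(beta):
        raise ValueError("alpha and beta must have the same length")
    weights = [a * b for a, b in zip(alpha, beta)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("the message is impossible under every alternative")
    return [w / total for w in weights]


# Hypothetical three-alternative example.
p = revise([0.5, 0.3, 0.2], [0.2, 0.2, 0.6])
print([round(x, 3) for x in p])
```

Because the common scale of the message vector cancels in the ratio, `revise(alpha, beta)` and `revise(alpha, [10 * b for b in beta])` give the same result.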

Since multiplying each $\beta$ by the same non-zero scale factor
has no effect on the resultant $\alpha$-to-$p$ transformation, it is
convenient to rescale the $\beta$ values such that the sum of the
$\beta$'s is 1. If we adopt this convention, and if we let $S_n$ be
the space we have defined for the n alternative problem,
then the following simple descriptive remarks are true:

1) The content of any message, once interpreted, is
represented by a single point, $\beta$, in $S_n$.

2) $\beta$ defines a continuous transformation which maps $S_n$ into
itself.

3) If $\beta$ is an interior point of $S_n$, then the transformation
is reversible, and the collection of all such transformations
is a commutative group. [Note: The unit element is

$$I_n = \left[ \frac{1}{n}, \frac{1}{n}, \dots, \frac{1}{n} \right]$$

and for an interior point

$$\beta = (\beta_1, \beta_2, \dots, \beta_n),$$

if we let

$$c = \frac{1}{\beta_1} + \frac{1}{\beta_2} + \dots + \frac{1}{\beta_n},$$

then the transformation defined by the point

$$\left( \frac{1}{c \beta_1}, \frac{1}{c \beta_2}, \dots, \frac{1}{c \beta_n} \right)$$

is the inverse of the transformation defined by the
point $\beta$.]

4) If $\beta$ lies on the boundary of $S_n$--i.e., if one component
of $\beta$ is 0--then the transformation defined by $\beta$ maps $S_n$
onto a part of its boundary, and reduces the dimension
of the problem. This transformation is not reversible.
The physical significance is that, in effect, one
alternative has been totally ruled out.
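
Remarks 1) through 4) can be checked numerically. The following sketch is illustrative only: the helper names are hypothetical, and the inverse is taken to be the normalized vector of reciprocals, which is what the group structure requires for an interior point.

```python
def normalize(v):
    """Rescale a vector of non-negative values so that it sums to 1."""
    total = sum(v)
    return [x / total for x in v]


def apply_message(beta, alpha):
    """Transformation defined by the point beta, applied to the point alpha."""
    return normalize([b * a for b, a in zip(beta, alpha)])


def unit_element(n):
    """The message that changes nothing: equal weight on every alternative."""
    return [1.0 / n] * n


def inverse(beta):
    """Point whose transformation undoes that of an interior point beta."""
    return normalize([1.0 / b for b in beta])


alpha = [0.5, 0.3, 0.2]
beta = [0.2, 0.2, 0.6]
gamma = [0.7, 0.2, 0.1]

# Unit element leaves alpha where it is; the inverse brings it back.
unchanged = apply_message(unit_element(3), alpha)
restored = apply_message(inverse(beta), apply_message(beta, alpha))

# Order of two messages does not matter (commutativity).
one_way = apply_message(gamma, apply_message(beta, alpha))
other_way = apply_message(beta, apply_message(gamma, alpha))

# A boundary point rules one alternative out for good.
ruled_out = apply_message([0.0, 0.5, 0.5], alpha)
print(restored, ruled_out)
```

Once a component has been driven to zero by a boundary point, no later interior message can raise it again, which is the irreversibility noted in remark 4).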

Thus, a message is described as a point; and each message
defines a mapping of $S_n$ into itself. We may consider each
point  in  its  initial  position  as  representing  a possible 
a priori  situation,  and  in  its  terminal  or  transformed 
position  as  representing  the  revised  probabilities  resulting 
from  the  message.  Figures  II-A,  II-B,  etc.,  and  III-A, 
III-B,  etc.,  show  some  examples  of  a three-alternative 
problem.  The  conditions  were  these: 

1) There was one target located at one apex of the triangle
[actually in the lower left, but the processor didn't
know].

2) The Monte Carlo signals received were exponentially
distributed with means N for noise-alone and S + N for
target-plus-noise.

3) The a priori probabilities used to define the starting
point were (1/3, 1/3, 1/3), i.e., total uncertainty.

4) The program received successive signals and updated the
derived probabilities until within .05 of the correct
answer. The value of j shown in each figure is the
number of steps required to reach this degree of nearness
to the goal.
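
An experiment of the kind listed above can be imitated with a short Python sketch. Several details are assumptions not stated in the text: one power sample is drawn per location at each step, "within .05 of the correct answer" is read as the true location's probability reaching .95, and the values S = 2, N = 1, the step cap, and the seed are hypothetical.

```python
import math
import random


def run_trial(signal_mean=2.0, noise_mean=1.0, true_index=0, n_alternatives=3,
              tolerance=0.05, max_steps=100_000, seed=0):
    """Update probabilities from simulated exponential powers until the true
    alternative holds probability >= 1 - tolerance. Returns (steps, p)."""
    rng = random.Random(seed)
    p = [1.0 / n_alternatives] * n_alternatives  # total uncertainty
    total_mean = signal_mean + noise_mean
    for step in range(1, max_steps + 1):
        # One received power per location; only the true location holds the target.
        x = [rng.expovariate(1.0 / (total_mean if k == true_index else noise_mean))
             for k in range(n_alternatives)]
        # Log of beta_j for "target at j", up to the common noise-only factor:
        # log[N / (S + N)] + S * x_j / (N * (S + N)).
        log_beta = [math.log(noise_mean / total_mean)
                    + signal_mean * xj / (noise_mean * total_mean) for xj in x]
        shift = max(log_beta)  # guards against overflow in exp
        weights = [pj * math.exp(lb - shift) for pj, lb in zip(p, log_beta)]
        total = sum(weights)
        p = [w / total for w in weights]
        if p[true_index] >= 1.0 - tolerance:
            return step, p
    return max_steps, p


steps, p = run_trial(seed=1)
print(steps, [round(v, 3) for v in p])
```

Running the function with different seeds gives a feel for how widely the number of steps can vary from one trial to the next.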

I have  shown  a large  number  of  examples  to  hint  at  the  wide 
variety  of  things  which  can  happen.  In  the  words  of 
Mr.  J.  R.  Wright,  "He  who  deals  with  probability  must  be 
prepared  to  take  a chance." 

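In the multialternative situations there is not in general
any simple way to factor the $\alpha$'s and $p$'s into separate
terms. In fact, the best factoring we can find is of the
form

$$\left( \frac{p_1}{1 - p_1} \right) \left( \frac{1 - \alpha_1}{\alpha_1} \right) = \frac{\beta_1 [\alpha_2 + \alpha_3 + \dots + \alpha_n]}{\alpha_2 \beta_2 + \alpha_3 \beta_3 + \dots + \alpha_n \beta_n}$$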

In many cases, the a priori probability distribution is of
a special sort. For instance:

If there are n locations where a target might be.

If there is but one target.

If the a priori probability the target is in none of
these locations is $\alpha$.

And if the target-present probability is distributed
uniformly over the n locations.

Then we are dealing with an n + 1 alternative problem,
thus:

$\alpha_0 = \alpha$ = probability of no target

$\alpha_j = (1 - \alpha)/n$ = a priori probability the target is in the jth location,

so the last expression becomes

$$\left( \frac{p_0}{1 - p_0} \right) \left( \frac{1 - \alpha}{\alpha} \right) = \frac{n \beta_0}{\beta_1 + \beta_2 + \dots + \beta_n}$$

This is almost the familiar form which leads to the chart
of Figure I, except that we must now use the expression

$$n \left( \frac{\beta_0}{\beta_1 + \beta_2 + \dots + \beta_n} \right)$$

as a sort of modified likelihood ratio. For
example, suppose we are receiving signals from 1000 resolvable
locations, so that n + 1 is 1001. Suppose that after the signals
are processed the normalized $\beta_1, \beta_2, \dots, \beta_n$ sum to .90909.
Then $\beta_0$ is .09091, and the virtual likelihood ratio to use
is $1000 \, (.09091 / .90909) \approx 100$. This converts the a priori probability
of .95 that no target is present into .9994+.
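
The arithmetic of this example is easy to check by machine. The sketch below assumes the message vector has been normalized to sum to 1, so that the location terms sum to one minus the no-target term; the function names are hypothetical.

```python
def virtual_likelihood_ratio(beta_absent, n_locations):
    """Modified likelihood ratio for 'no target' against 'target somewhere'.

    beta_absent -- normalized message weight for the no-target alternative,
                   so the n location weights sum to 1 - beta_absent
    """
    return n_locations * beta_absent / (1.0 - beta_absent)


def revise_absent_probability(alpha_absent, beta_absent, n_locations):
    """Revised probability that no target is present, via the odds form."""
    ratio = virtual_likelihood_ratio(beta_absent, n_locations)
    odds = alpha_absent / (1.0 - alpha_absent) * ratio
    return odds / (1.0 + odds)


# Numbers from the example: 1000 locations, location weights summing to 10/11.
ratio = virtual_likelihood_ratio(1.0 / 11.0, 1000)
revised = revise_absent_probability(0.95, 1.0 / 11.0, 1000)
print(round(ratio, 3), round(revised, 5))
```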
III. QUESTIONS

The  essential  vulnerability  of  signal  processing  to  the 
a priori  description  may  not  be  avoided--ask  a wrong 
question  and  you  get  a wrong  answer.  The  rightness  or 
wrongness  of  the  a priori  description,  however,  does  not 
come within the purview of the signal processor. Another
matter is his concern, and is a topic of great importance.
This  is  the  question  "What  do  errors  in  the  signal 
processor's  notion  of  the  signal  probability  distributions 
do  to  his  interpretations?"  This  very  broad  question  gives 
rise  to  a number  of  specific  ones  of  considerable  interest, 
a few  of  which  are  listed  below: 

1) In a typical signal processing situation, the forms
of the distributions of signal plus noise and of
noise alone are known, but the values of some of
the defining parameters may be known only approximately.
For instance, suppose the signal plus
noise is Rayleigh distributed in voltage amplitude
(or exponentially distributed in power) while the
noise alone is also exponentially distributed in
power. Then when we receive a single echo power
of x, the likelihood ratio to use is obviously

$$\frac{N}{S + N} \, e^{\frac{S x}{N (S + N)}}$$

(with N and S + N the mean powers of noise alone
and of signal plus noise).
But what happens if we do not know the precise
value of S to use? If we use an S which is too
big, obviously this will cause us to overrate the
importance of the received signal x. If we use too
small an S, the converse will hold. Qualitatively
we can see a reassuring trend toward stability
here, since if we use too big an S the actual x's
we get will tend to be small in comparison to those
we would have gotten had the S been correct--so
that the overemphasis will be an overemphasis of
rather understated signals. The real challenge is
to put this matter in quantitative terms. Work
needs to be done to determine the sensitivity of
such an interpreter to errors in our assessment
of S.

2)  A completely  similar  exercise  needs  to  be  carried 
out  for  the  Rayleigh  noise  and  Rayleigh-Rice 
target  situation. 

3) The following proposition needs to be investigated--
it may be a theorem. "If d and f are the true
signal-plus-noise and noise-alone distributions
describing a message situation, and if d' and f'
are erroneous descriptions used by the signal
processor in interpreting the message, then there
exists a nonreversible transformation T(x) = y
such that if x is distributed according to d, y is
distributed according to d', and such that if x is
distributed according to f, y is distributed
according to f'." If this theorem is true, then
errors in interpretation arising from mistaken
concepts of d and f may be thought of in the same
light as are errors produced by irreversible and
information-destroying distortions of the received
waveforms.
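
As a first, purely illustrative look at question 1), the sketch below evaluates the average log likelihood ratio per echo when the interpreter assumes one value of S while the echoes actually come from another. It is not the quantitative analysis called for above; the function names and the sample values (N = 1, true S = 2) are hypothetical. Because the log of the exponential likelihood ratio is linear in x, its average equals its value at the mean power.

```python
import math


def log_likelihood_ratio(x, s_assumed, noise):
    """Log of N/(S+N) * exp(S*x / (N*(S+N))) for one echo power x."""
    return (math.log(noise / (s_assumed + noise))
            + s_assumed * x / (noise * (s_assumed + noise)))


def mean_log_ratio(s_assumed, noise, mean_power):
    """Average log likelihood ratio when echoes have the given mean power.

    The log ratio is linear in x, so its mean is its value at the mean power.
    """
    return log_likelihood_ratio(mean_power, s_assumed, noise)


noise = 1.0
s_true = 2.0
for s_assumed in (0.5, 1.0, 2.0, 4.0, 8.0):
    with_target = mean_log_ratio(s_assumed, noise, s_true + noise)
    noise_only = mean_log_ratio(s_assumed, noise, noise)
    print(s_assumed, round(with_target, 3), round(noise_only, 3))
```

With these sample values, the average evidence per echo in favor of the target is largest when the assumed S equals the true S and falls off on either side, which is one way of putting the stability remark in numbers.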


In  order  to  carry  out  a systematic  investigation  of  these 
and  related  questions,  we  need  to  agree  upon  a general 
measure  of  the  meaningfulness  of  an  interpreted  message. 

As  a candidate  I propose  the  following  measure: 

A. For the n alternative problem in which we know that
there is one and only one target present and in which,
after we have interpreted the messages available to us
as best we can, our estimate that the target is in the
jth location is $p_j$, I propose the measure

$$Q = \prod_{j=1}^{n} \left( \frac{1}{p_j} \right)^{p_j}$$

Note that if $p_j = 1/n$ for every j, Q = n. If one of the $p_j$ is 1,
and the rest are 0, then (in the limit) Q = 1.

Q stands  for  quandary.  It  is  a measure  related  to  how 
much  more  we  would  need  to  know  to  pin  down  the  target 
location completely. An optimal processor reduces Q as
much as possible; other processors do less well. Therefore,
Q is a good measure of the goodness of a processor.

B. For the n + 1 alternative case where the extra alternative
covers the possibility the target is not there at
all, the problem is more complex. One alternative would
be to define Q as before; but I feel the need to segregate
the target-not-present case from the
target-present-but-in-an-unknown-location case. One possibility is to
re-normalize the target present probabilities thus

$$p'_j = \frac{p_j}{\sum_{\ell=1}^{n} p_\ell}$$

and then let

$$U = (1 - p_0) \prod_{j=1}^{n} \left( \frac{1}{p'_j} \right)^{p'_j}$$

where $p_0$ is the derived probability the target is absent.

Note that when we know the target is present, U = Q,
and that otherwise U is the probability that a target
is present times our quandary regarding where it is.

U stands for urgency.
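
The two measures can be sketched in Python as follows. The sketch assumes that Q is the product over locations of (1/p_j) raised to the power p_j, which agrees with the stated values Q = n for equal probabilities and Q = 1 for certainty, and that U is the target-present probability times the Q of the re-normalized location probabilities. The function names are hypothetical.

```python
import math


def quandary(p):
    """Q = product over j of (1/p_j)**p_j, computed through logarithms.

    Terms with p_j = 0 contribute their limiting value of 1.
    """
    return math.exp(-sum(pj * math.log(pj) for pj in p if pj > 0.0))


def urgency(p_absent, p_locations):
    """U = (probability a target is present) * Q of the re-normalized
    location probabilities."""
    present = sum(p_locations)
    if present == 0.0:
        return 0.0  # target certainly absent: nothing to be urgent about
    renormalized = [pj / present for pj in p_locations]
    return (1.0 - p_absent) * quandary(renormalized)


print(quandary([0.25] * 4))          # equal probabilities over 4 locations
print(quandary([1.0, 0.0, 0.0]))     # location known with certainty
print(urgency(0.5, [0.25, 0.25]))    # half a chance of a target, two locations
```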


It  is  my  hope  and  belief  that  these  definitions  will  prove 
heuristic  in  our  investigation  of  the  interpretation 
problems,  and  will  lead  to  useful  results  directly  applicable 
to  systems  planning. 